Question 1 [15 Marks]


Answer each of the following multiple choice questions by clearly circling the letter 
corresponding to your answer. 


(i) Which one of the following statements best describes the relationship between 
the median and mean for a skewed dataset? 


A. The mean and median will always be the same.
B. The mean and median will usually be the same.
C. The mean will always be higher than the median.
D. Whether the mean is higher or lower than the median depends on whether
the data set is skewed to the right or to the left. [1 Mark]


(ii) Which one of these X variables is a discrete random variable? 


A. An experiment in physics is repeated many times and X is the time required
for a reaction to occur in seconds.
B. A student is randomly selected and X is the number of correct answers on
a five-question multiple-choice quiz.
C. A StarTrack Express package is randomly selected and X is the weight in
grams of the package.
D. A student is randomly selected and X is the distance they must travel in
metres to go from their college room door to the door of their 8 am class on
Monday morning. [1 Mark]


(iii) Which of the following relationships could be analyzed using a chi-square test? 


A. The relationship between height (cm) and weight (kg).
B. The relationship between satisfaction with secondary schools (satisfied or
not) and political party affiliation.
C. The relationship between gender and amount willing to spend on a stereo
system (in dollars).
D. The relationship between opinion on migration and income earned last year
(in thousands of dollars). [1 Mark]
(iv) Which one of the following choices describes a problem for which an analysis of
variance would be appropriate?


A. Comparing the proportion of successes for three different treatments of anxiety.
Each treatment is tried on 100 patients.
B. Analyzing the relationship between high school GPA and college GPA.
C. Comparing the mean birth weights of newborn babies for three different
racial groups.
D. Analyzing the relationship between gender and opinion about capital punishment
(favor or oppose). [1 Mark]


(v) Which one of the following probabilities is a cumulative probability? 


A. The probability that there are exactly 4 people with Type O+ blood in a
sample of 10 people.
B. The probability of exactly 3 heads in 6 flips of a coin.
C. The probability that the accumulated annual rainfall in a certain city next
year, rounded to the nearest millimetre, will be 450 mm.
D. The probability that a randomly selected woman’s height is 1.7 metres or
less. [1 Mark]


(vi) Researchers would like to compare meditation and exercise to see which is more 
effective for reducing stress. One hundred people who suffer from high stress 
volunteer to participate in a study for ten weeks. Participants will either be given 
a 10-week course in meditation or will participate in a 10-week exercise program. 
The researchers must decide whether to randomly assign the volunteers to the 
two programs, or allow them to choose. 


Which of the following is the main advantage of randomly assigning participants 
to the two programs rather than allowing them to choose? 


A. The participants are more likely to stick with the program for the full 10
weeks.
B. Confounding variables, such as past practice of meditation, should be
approximately equal for the two groups.
C. Random assignment ensures that the two sample sizes are equal and that
requirement is necessary in studies like this one.
D. Random assignment will allow the results to be extended to the population
of all adults. [1 Mark]
(vii) A comparison is to be made between the proportion of Year 2 students that
cannot read at Year 2 level and the proportion of Year 3 students that cannot
read at Year 2 level. School records from schools across New South Wales
are collected and records for 123 Year 2 students and 146 Year 3 students are
randomly selected. Of the sampled Year 2 students, 25 do not read at Year 2
level. Of the sampled Year 3 students, 26 do not read at Year 2 level.


What is the correct notation for the difference 25/123 - 26/146?


A. \(\mu_1 - \mu_2\)
B. \(\bar{x}_1 - \bar{x}_2\)
C. \(p_1 - p_2\)
D. \(\hat{p}_1 - \hat{p}_2\) [1 Mark]


(viii) The average time taken to complete an exam follows a Normal probability
distribution with mean = 90 minutes and standard deviation = 45 minutes.


What is the probability that a randomly chosen student will take more than 45
minutes to complete the exam?


A. 0.16
B. 0.50
C. 0.84
D. 0.997 [1 Mark]
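
A minimal sketch of how a Normal tail probability like the one in (viii) could be checked numerically. It uses only Python's standard library; the helper name `normal_cdf` is illustrative and is not part of the paper.

```python
import math


def normal_cdf(x, mean, sd):
    """Cumulative probability P(X <= x) for a Normal(mean, sd) variable."""
    z = (x - mean) / sd  # standardise to a z-score
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Values from the question: mean 90 minutes, standard deviation 45 minutes.
# "More than 45 minutes" is the upper tail, so take the complement of the CDF.
p_more_than_45 = 1.0 - normal_cdf(45, mean=90, sd=45)
print(round(p_more_than_45, 2))
```

Here 45 minutes sits exactly one standard deviation below the mean, so the result can also be read off the empirical (68-95-99.7) rule.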


(ix) A statistics unit has 5 tutors: three female assistants (Lauren, Rona, and Leila)
and two male assistants (Josh and Lorcan). Each tutor teaches one tutorial 
group. A student selects a tutorial group. 


The two events W = {the tutor is a woman} and J = {the tutor is Josh} are 


A. simple events. 
B. complementary events. 
C. mutually exclusive events. 


D. independent events. [1 Mark] 
(x) If the confidence level for a confidence interval is increased, which of the following
must also increase?


A. interval width
B. sample estimate
C. standard error
D. none of the elements would be increased [1 Mark]


(xi) A sample of 10 plants was taken and the mean height was 80.15 cm, with a 
sd of 7.10 cm. A 95% confidence interval for the true mean height of plants of 
that particular species is (75.1 cm, 85.2 cm). Four students gave the following 
interpretations of the confidence interval. Which of the following is correct? 


A. We are 95% confident that the interval (75.1 cm, 85.2 cm) will include the
true mean height.
B. We are 95% confident that the true mean height is 80.15 cm since that value
lies in the confidence interval.
C. We can be confident that 95% of all plants of that species have a height
between 75.1 cm and 85.2 cm.
D. None of the above. [1 Mark]


(xii) Which of the following studies describes a paired data design? 


A. The testosterone levels of male doctors and male college professors are
compared.
B. 40 students measure their blood pressure twice: first while resting and then
again after running in place for 10 minutes.
C. The mean blood pressure of men is compared to the mean blood pressure
of women.
D. All of the above involve paired data. [1 Mark]
(xiii) Women’s height follows a Normal distribution with mean = 165.1 cm and
standard deviation = 6.9 cm. Jill is 154.9 cm tall. If you convert Jill’s height to a
z-score, which is closest to the correct answer?


A. z = -1.96
B. z = -1.11
C. z = -1.48
D. z = 1.96 [1 Mark]


(xiv) Which of the following best describes the standardized z score for an observation? 


A. It is the center of the list of scores from which the observation was taken. 
B. It is one standard deviation more than the observation. 
C. It is the most common score for that type of observation. 


D. It is the number of standard deviations the observation falls from the mean. 
[1 Mark] 


(xv) Dr Richard Hurt and his colleagues (Hurt et al. 1994) randomly assigned
volunteers wanting to quit smoking to wear either a nicotine patch or a placebo patch
to determine whether wearing a nicotine patch improves the chance of quitting 
smoking. After 8 weeks of use, 46% of those wearing the nicotine patch but only 
20% of those wearing the placebo patch had quit smoking. This difference was 
statistically significant. 


Which statement is supported by the evidence? 
A. While the nicotine patch may be effective, we cannot be sure because this
is an observational study.


B. Given the study design and the statistical significance, this study demon- 
strates that the nicotine patch improves the chance of quitting smoking. 


C. Had there been a control group, this study may have been useful, but the 
poor design means we cannot draw definitive conclusions. 


D. This demonstrates that people wanting to quit smoking who use a nicotine 
patch will usually be able to. [1 Mark] 
Question 2 [4 Marks]


Toxemia is a dangerous condition of pregnancy more likely to occur in diabetic mothers 
than non-diabetics. About 2% of the female population are diabetic and of these 25% 
are likely to develop toxemia during pregnancy. In the non-diabetic female population, 
4.6% develop toxemia during pregnancy. 


(i) Consider a randomly selected pregnant woman. Draw a tree diagram represent- 
ing the four possible outcomes for this woman and attach to the diagram the 
probabilities for each branch and outcome. [2 Marks] 


(ii) What is the probability that a pregnant woman develops toxemia? [1 Mark] 


P(toxemia): 


(iii) What is the probability that a woman developing toxemia is non-diabetic? 
[1 Mark] 


P(non-diabetic | toxemia): 
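
A minimal sketch of how the tree-diagram arithmetic for this question could be checked, assuming the three percentages in the question are read as probabilities. The variable names are illustrative only and do not come from the paper.

```python
# Probabilities stated in the question.
p_diabetic = 0.02
p_tox_given_diabetic = 0.25
p_tox_given_non_diabetic = 0.046

p_non_diabetic = 1 - p_diabetic

# Joint probability for each of the four end points of the tree:
# P(branch) = P(diabetic status) * P(toxemia outcome | diabetic status).
branches = {
    ("diabetic", "toxemia"): p_diabetic * p_tox_given_diabetic,
    ("diabetic", "no toxemia"): p_diabetic * (1 - p_tox_given_diabetic),
    ("non-diabetic", "toxemia"): p_non_diabetic * p_tox_given_non_diabetic,
    ("non-diabetic", "no toxemia"): p_non_diabetic * (1 - p_tox_given_non_diabetic),
}

# Law of total probability: add the two branches that end in toxemia.
p_toxemia = branches[("diabetic", "toxemia")] + branches[("non-diabetic", "toxemia")]

# Bayes' rule: share of the toxemia probability contributed by non-diabetics.
p_non_diabetic_given_toxemia = branches[("non-diabetic", "toxemia")] / p_toxemia

print(round(p_toxemia, 5), round(p_non_diabetic_given_toxemia, 4))
```

The four joint probabilities should sum to 1, which is a quick check that the tree has been filled in consistently.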
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Question 3 [5 Marks] 


Seasonal factors such as the incidence of infectious agents during gestation may affect 
handedness. The work of Martin and Jones (1999, Cortex 35:123-128) suggests that 
there may be a lower incidence of left-handedness for people born between February 
and August (within the southern hemisphere). Data collected from statistics students 
is summarized below (LH = left-handed, RH = right-handed): 


Born       LH    RH
Feb-Aug    28   226
Sep-Jan    42   172
Total      70   398


For (i) and (ii), circle the correct answer. 


(i) At the \(\alpha = 0.05\) level of significance, the critical value of \(\chi^2\) is 3.84, while at the
\(\alpha = 0.01\) level of significance, the critical value of \(\chi^2\) is 6.63. The value of the
chi-square statistic for this study is 6.76. Which statement about the p-value for
the \(\chi^2\) test of an association between season of birth and handedness is true?


A. The p-value is greater than 0.05 
B. The p-value is between 0.01 and 0.05 
C. The p-value is less than 0.01 


D. None of the above [1 Mark] 


(ii) The odds ratio of being left-handed for people born in Feb-Aug relative to those
born from Sep-Jan is closest to:


A. 0.12 
B. 0.51 
C. 1.56 


D3 L0F [1 Mark] 
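
A minimal sketch of how the chi-square statistic quoted in (i) and the odds ratio in (ii) could be reproduced from the table. It assumes the usual Pearson statistic without a continuity correction; the variable names are illustrative and not taken from the paper.

```python
# Observed counts from the table (LH = left-handed, RH = right-handed).
observed = {
    "Feb-Aug": {"LH": 28, "RH": 226},
    "Sep-Jan": {"LH": 42, "RH": 172},
}
hands = ("LH", "RH")

row_totals = {season: sum(counts.values()) for season, counts in observed.items()}
col_totals = {hand: sum(observed[season][hand] for season in observed) for hand in hands}
n = sum(row_totals.values())

# Pearson chi-square: sum over cells of (observed - expected)^2 / expected,
# where expected = row total * column total / grand total.
chi_square = 0.0
for season in observed:
    for hand in hands:
        expected = row_totals[season] * col_totals[hand] / n
        chi_square += (observed[season][hand] - expected) ** 2 / expected

# Odds of being left-handed within each season, then their ratio.
odds_feb_aug = observed["Feb-Aug"]["LH"] / observed["Feb-Aug"]["RH"]
odds_sep_jan = observed["Sep-Jan"]["LH"] / observed["Sep-Jan"]["RH"]
odds_ratio = odds_feb_aug / odds_sep_jan

print(round(chi_square, 2), round(odds_ratio, 2))
```

The statistic can then be compared with the two critical values given in (i) to bracket the p-value.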
(iii) Is there evidence of a decreased incidence of left-handedness amongst statistics
students born between February and August (in the southern hemisphere)?
Justify your answer. [2 Marks]


(iv) A scientist speculates that any seasonal pattern to handedness would be reversed 
in the northern hemisphere due to the seasonal nature of infectious agents, and 
conducts a similar study in England. If the pattern was indeed reversed, would 
this prove that infectious agents cause handedness? Justify your answer. 

[1 Mark] 
Question 4 [8 Marks]


Many patients undergoing treatment for substance abuse begin taking drugs again 
within 12 months, and one measure of a successful treatment is based on the proportion 
of patients who are drug-free 12 months after treatment. In a study of the effect of 
EEG biofeedback training on drug treatment (Scott et al. 2005, The American Journal 
of Drug and Alcohol Abuse), patients were randomly assigned to a control treatment
(the Minnesota Model 12-step program) or an experimental treatment (the Minnesota 
Model 12-step program plus EEG biofeedback training). In the control group, 12 of 27 
patients were drug-free 12 months after treatment, while, in the experimental group, 
36 of 47 patients were drug-free 12 months after treatment. You may assume that the 
patients in the study are a representative random sample of the population of possible 
patients and that the experiment was conducted properly. 


(i) Estimate the proportion of patients receiving the experimental treatment who
were drug-free after 12 months (\(p_E\)), the proportion of patients receiving the
control treatment who were drug-free after 12 months (\(p_C\)), and the difference
between the proportions (\(p_E - p_C\)). [2 Marks]


Estimated proportion (experimental treatment): 


Estimated proportion (control treatment): 


Estimated difference (experimental versus control): 


(ii) Estimate the standard error for \(p_E - p_C\). [2 Marks]


Estimated standard error:


(iii) Calculate a 95% confidence interval for the difference using z* = 1.96. [2 Marks]


Confidence interval: 


(iv) Is the experimental treatment more effective for drug treatment than the control? 
Justify your answer. [2 Marks] 
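
A minimal sketch of how the quantities in parts (i) to (iii) could be computed. It assumes the unpooled standard error sqrt(p_E(1 - p_E)/n_E + p_C(1 - p_C)/n_C), which is the usual choice for a confidence interval for a difference in proportions; the paper's formula sheet is not reproduced here, so that choice and all variable names are assumptions.

```python
import math

# Counts from the question.
drug_free_exp, n_exp = 36, 47    # experimental treatment
drug_free_ctrl, n_ctrl = 12, 27  # control treatment
z_star = 1.96                    # multiplier given in part (iii)

# (i) Sample proportions and their difference.
p_exp = drug_free_exp / n_exp
p_ctrl = drug_free_ctrl / n_ctrl
diff = p_exp - p_ctrl

# (ii) Unpooled standard error of the difference (assumed formula).
se_diff = math.sqrt(p_exp * (1 - p_exp) / n_exp + p_ctrl * (1 - p_ctrl) / n_ctrl)

# (iii) 95% confidence interval: estimate +/- z* x standard error.
ci_lower = diff - z_star * se_diff
ci_upper = diff + z_star * se_diff

print(round(p_exp, 3), round(p_ctrl, 3), round(diff, 3))
print(round(se_diff, 4), (round(ci_lower, 3), round(ci_upper, 3)))
```

Whether the interval excludes zero is the natural thing to look at when answering part (iv).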
Question 5 [8 Marks]
A snowmelt-related erosion index (y) was modelled as a linear function of snowmelt
runoff (x, in mm) using data from n = 36 climatological stations across Canada where
runoff amount ranged from 116 to 354 mm and the erosion index ranged from 3 to 608.


R output for the regression is given below: 


Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 121.5939    85.9083   1.415  0.16605
x             1.1270     0.3653   3.085  0.00403 **

Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 130 on 34 degrees of freedom
Multiple R-squared: 0.2187, Adjusted R-squared: 0.1958
F-statistic: 9.519 on 1 and 34 DF, p-value: 0.004025


(i) Write down the regression equation relating the erosion index to runoff. [1 Mark]


Equation: 


(ii) A number of conditions must be met for a regression to be valid. One of those
conditions is that there is constant variance. In the space below, sketch an
example of x-y data where the constant variance assumption is not met. [1 Mark]


[Blank axes for the sketch: horizontal axis marked from 50 to 350, vertical axis marked at 200, 400 and 600.]
For questions (iii) through (viii), assume that all regression assumptions have
been met.


(iii) Is there statistical evidence that runoff is related to erosion? Justify your answer. 
[1 Mark] 


(iv) Is runoff a precise predictor of erosion? (Hint: interpret the \(R^2\) statistic.)
[1 Mark] 


(v) Predict the erosion index when runoff is 300 mm. [1 Mark] 


Prediction: 


(vi) One climatological station with runoff = 300 mm had an erosion index of 420.
Calculate the residual for this station. [1 Mark]


Residual:


(vii) Calculate the 95% confidence interval for the slope. Use t* = 2.03. [1 Mark]


Confidence interval:


(viii) Interpret the confidence interval from (vii). [1 Mark]
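
A minimal sketch of the arithmetic behind parts (v) to (vii), using the coefficient estimates from the R output and the multiplier t* = 2.03 given in (vii). The function and variable names are illustrative only.

```python
# Estimates copied from the regression output.
intercept = 121.5939
slope = 1.1270
slope_se = 0.3653
t_star = 2.03  # multiplier given in part (vii)


def predict_erosion(runoff_mm):
    """Fitted erosion index for a given runoff (mm) from the fitted line."""
    return intercept + slope * runoff_mm


# (v) Prediction at runoff = 300 mm (inside the observed range of 116 to 354 mm).
predicted = predict_erosion(300)

# (vi) Residual = observed value minus fitted value.
residual = 420 - predicted

# (vii) Confidence interval for the slope: estimate +/- t* x standard error.
slope_ci = (slope - t_star * slope_se, slope + t_star * slope_se)

print(round(predicted, 1), round(residual, 1))
print(tuple(round(bound, 3) for bound in slope_ci))
```

A negative residual means the station's observed erosion index lies below the fitted line.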
Question 6 [5 Marks]


The government is concerned at the size of student loan debt. A pilot sample of 20
random people across Australia with student loans had a mean outstanding loan balance
of \(\bar{x}\) = $36,169 with a standard deviation of \(s\) = $10,352.


(i) Find the estimate for the standard error of the mean. [1 Mark] 


Standard error: 


(ii) If the government wanted to conduct a new study that had twice the precision, 
or half the standard error, of the pilot study, (approximately) how large should 
the sample size for the new study be? (Circle the answer.) [1 Mark] 


A. 10 people 
B. 40 people 
C. 80 people 


D. 160 people 


(iii) How many degrees of freedom are associated with the t-statistic for this study? 
[1 Mark] 


df: 


(iv) Construct a 95% confidence interval for the mean student loan balance based on
this pilot sample. (Use t* = 2.09) [1 Mark]


95% confidence interval: 
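
A minimal sketch of the calculations behind parts (i) to (iv), assuming the standard one-sample formulas: standard error s / sqrt(n), degrees of freedom n - 1, and interval mean +/- t* x standard error. The variable names are illustrative only.

```python
import math

# Pilot-sample summary statistics from the question.
n = 20
sample_mean = 36169.0
sample_sd = 10352.0
t_star = 2.09  # multiplier given in part (iv)

# (i) Standard error of the mean.
standard_error = sample_sd / math.sqrt(n)

# (ii) The standard error scales with 1 / sqrt(n), so halving it
# requires multiplying the sample size by 2 squared.
n_for_half_se = n * 2 ** 2

# (iii) Degrees of freedom for a one-sample t-statistic.
degrees_of_freedom = n - 1

# (iv) 95% confidence interval for the mean loan balance.
margin = t_star * standard_error
confidence_interval = (sample_mean - margin, sample_mean + margin)

print(round(standard_error, 2), n_for_half_se, degrees_of_freedom)
print(tuple(round(bound, 2) for bound in confidence_interval))
```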


(v) A different study compares student loan debt between students living in New
South Wales and those living in Queensland. The standard deviation for those
living in New South Wales was \(s_{NSW}\) = $9,367, while the standard deviation for
those living in Queensland was \(s_Q\) = $12,411. Would it be acceptable to use the
pooled standard deviation when calculating the difference in student loan debt
between New South Wales and Queensland students? Justify your answer.

[1 Mark] 
Question 7 [5 Marks]


Two species of small marsupials (brush-tailed mulgara and Pilbara ningaui) can use
one of four plant types (type A, type B, type C, type D) for cover to protect against
predators. Attendance patterns for the two marsupial species at the different plant
types are presented in the following table:


               Plant Type
Species      A    B    C    D
mulgara     10   17   21   37
ningaui     12    8   15   10


Pearson’s Chi-squared test 


data: GS.Table 
X-squared = 8.4221, df = ___, p-value = 0.03805 


(i) State (in words) the null hypothesis for this Chi-Square Test. [1 Mark] 


(ii) State (in words) the alternative hypothesis for this Chi-Square Test. [1 Mark]


(iii) Find the missing degrees of freedom for this Chi-Square Test. [1 Mark]


df: 


(iv) What is the expected number of mulgara who would associate with plant type A 
under the null hypothesis? [1 Mark] 


Expected count: 


(v) The p-value for this test is 0.038. Interpret. [1 Mark] 
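
A minimal sketch of how the degrees of freedom, the expected count in (iv), and the X-squared value in the output could be reproduced from the table. It assumes the usual Pearson chi-square test of independence; the variable names are illustrative and not taken from the paper.

```python
# Observed counts from the table, one list per species, columns A to D.
plant_types = ["A", "B", "C", "D"]
observed = {
    "mulgara": [10, 17, 21, 37],
    "ningaui": [12, 8, 15, 10],
}

row_totals = {species: sum(counts) for species, counts in observed.items()}
col_totals = [sum(observed[species][j] for species in observed)
              for j in range(len(plant_types))]
n = sum(row_totals.values())

# Degrees of freedom for an r x c table: (r - 1) * (c - 1).
degrees_of_freedom = (len(observed) - 1) * (len(plant_types) - 1)

# Expected count under independence: row total * column total / grand total.
expected = {
    species: [row_totals[species] * col_totals[j] / n for j in range(len(plant_types))]
    for species in observed
}
expected_mulgara_a = expected["mulgara"][0]

# Pearson statistic: sum of (observed - expected)^2 / expected over all cells.
chi_square = sum(
    (observed[species][j] - expected[species][j]) ** 2 / expected[species][j]
    for species in observed
    for j in range(len(plant_types))
)

print(degrees_of_freedom, round(expected_mulgara_a, 2), round(chi_square, 4))
```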
Formulae are given on page 20


Please remember: This examination question paper MUST BE HANDED IN. Failure 
to do so may result in the cancellation of all marks for this examination. Writing your 
name and number on the front will help us confirm that your paper has been returned. 
Formulae
P(A B 